Advanced research methods

Session 1

Caspar David Peter

Rotterdam School of Management, Accounting Department

Introduction

Introduction

Motivation

Why is research exciting?

“(…) the world is our lab, and the many diverse people in it are our subjects” (Angrist and Pischke, 2015)

What are accounting researchers investigating?

  • What is the impact of digitalization on audit practices?
  • What are the real effects of disclosure regulation?
  • How do firms react to tax changes?

Introduction

Goal

Uncover the causal effect of an intervention

Introduction

How do we find the answers?

We use data to answer these “cause-and-effect” questions!

Correlation \(\neq\) Causation

Trust the process

Theory, hypotheses, and operationalisation

Theory

Theory, hypotheses, and operationalisation

Theory

Conceptual level

  • Theory to explain the relationship
  • Formulate a hypothesis

Operational level

  • Measure the variables
  • Test the hypothesis

Theory, hypotheses, and operationalisation

What is a hypothesis?

  • A hypothesis is a proposed explanation for a phenomenon or a prediction of a possible causal relationship
  • It is a tentative answer to a research question that guides the direction of study and investigation
  • Its formulation is guided by existing (economic) theory
  • A well-constructed hypothesis is testable, meaning it can be supported or rejected through experimentation

Theory, hypotheses, and operationalisation

How do we test a hypothesis?

  • We need a baseline to test against: the null hypothesis (\(H_0\))
  • Why “null”? It states that there is no effect or no difference, which serves as the baseline assumption
    • \(H_0\) is what you test your \(H_1\) against, e.g. \(H_0: \mu = 0\)
    • \(H_1\) is the/your alternative hypothesis - what you might believe to be true
  • A statistical test rejects \(H_0\) if there is enough evidence against it
    • If we reject \(H_0\), we find support for \(H_1\)

The t-test measures how far the sample mean is from the value under \(H_0\)

Hypotheses:

  • One-sided (\(H_0: \mu \leq 0\), \(H_1: \mu > 0\)): \(t = \frac{\bar{X}-0}{SE(\bar{X})}\)
  • Two-sided (\(H_0: \mu_{treated} = \mu_{control}\), \(H_1: \mu_{treated} \neq \mu_{control}\)): \(t = \frac{\bar{X}_{treated}-\bar{X}_{control}}{SE(\bar{X}_{treated}-\bar{X}_{control})}\)

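As a sketch, such a two-sample t-test can be run in R on simulated data (the numbers below are illustrative, not from the course dataset):

```r
# Illustrative two-sample t-test on simulated data (not the course data)
set.seed(1)
treated <- rnorm(100, mean = 0.5)   # outcomes in the treated group
control <- rnorm(100, mean = 0.0)   # outcomes in the control group

# Two-sided test of H0: the two group means are equal
res <- t.test(treated, control, alternative = "two.sided")
res$statistic   # the t-statistic
res$p.value     # reject H0 if this falls below the significance level
```
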
Operationalisation

Causal inference

A small Becksperiment

Research question: Does beer affect performance?

  • \(H_0\): Beer consumption does not affect performance
  • \(H_1\): Beer consumption affects performance

Becksperiment

What is the problem?

Simple group comparison

  • The results suggest that, on average, beer drinking improves grades.
  • But is this true? Did we find the causal effect of beer drinking on grades?
# Plot average performance by treatment status
becks_data %>% 
  group_by(treatment) %>% 
  summarise(across("performance", ~ mean(.x))) %>% 
  mutate(beer = case_when(treatment == 1 ~ "Beer", 
                          treatment == 0 ~ "No beer"),
         # Cosmetic change to reorder so "Beer" is on the left
         beer = factor(beer, levels = c("Beer", "No beer"))) %>%
  ggplot(aes(x = beer, y = performance, group = beer, fill = beer)) +
  geom_col() +
  # Set breaks and limits in one scale call; a separate ylim() would
  # silently replace the scale and drop the breaks
  scale_y_continuous(breaks = seq(0, 10, 1), limits = c(0, 10)) +
  scale_fill_manual(name = "", values = c("#F7AB64", "#93B8D6")) +
  labs(x = "", y = "Grade") +
  ggthemes::theme_hc(base_size = 18) + 
  theme(legend.position = "none")

Why can simple group comparison be misleading?

Becksperiment

How comparable are the groups, on average?

Variable       Beer   No beer   Difference   t-stat.   p-value
------------  -----  --------  -----------  --------  --------
Performance    8.44      7.00         1.43    -10.09      0.00
Education      3.73      2.92         0.82     -6.54      0.00
Gender         0.40      0.40         0.00     -0.11      0.91

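A hedged sketch of how such a balance table can be produced; since becks_data itself is not shown in the slides, this uses simulated stand-in data with similar group means:

```r
# Balance check: per-variable t-tests across treatment groups
# (simulated stand-in data; the real becks_data is not shown here)
set.seed(42)
sim <- data.frame(
  treatment   = rep(0:1, each = 150),
  performance = c(rnorm(150, mean = 7.0), rnorm(150, mean = 8.4)),
  education   = c(rnorm(150, mean = 2.9), rnorm(150, mean = 3.7))
)

balance <- t(sapply(c("performance", "education"), function(v) {
  tt <- t.test(sim[[v]] ~ sim$treatment)
  c(diff    = unname(tt$estimate[2] - tt$estimate[1]),
    t.stat  = unname(tt$statistic),
    p.value = tt$p.value)
}))
round(balance, 2)
```
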
Becksperiment

Do other factors play a role?

# Plot results for simple group comparison - condition on observed factors

becks_data %>% 
  mutate(high_edu = case_when(education > mean(becks_data$education) ~ 1, 
                              education <= mean(becks_data$education) ~ 0) %>% as.character()) %>% 
  group_by(treatment, high_edu) %>% 
  summarise(across("performance", ~ mean(.x))) %>% 
  mutate(beer = case_when(treatment == 1 ~ "Beer", 
                          treatment == 0 ~ "No beer"),
         # Cosmetic change to reorder so "Beer" is on the left
         beer = factor(beer, levels = c("Beer", "No beer"))) %>%
  ggplot(aes(x = beer, y = performance, group = high_edu, color = high_edu)) +
  geom_point(size = 2) +
  geom_line(aes(group = high_edu), linetype = "dashed", size = 1.5) +
  # Set breaks and limits in one scale call instead of a separate ylim()
  scale_y_continuous(breaks = seq(0, 10, 1), limits = c(4, 10)) +
  scale_color_manual(values = c("0" = "#93B8D6", "1" = "#F7AB64"), name = "High education") +
  labs(x = "", y = "Grade") +
  ggthemes::theme_hc(base_size = 18) + 
  theme(legend.position = "bottom")

Becksperiment

Do other factors play a role?

becks_data %>% 
  mutate(gender = as.character(gender)) %>% 
  group_by(treatment, gender) %>% 
  summarise(across("performance", ~ mean(.x))) %>% 
  mutate(beer = case_when(treatment == 1 ~ "Beer", 
                          treatment == 0 ~ "No beer"),
         # Cosmetic change to reorder so "Beer" is on the left
         beer = factor(beer, levels = c("Beer", "No beer"))) %>%
  ggplot(aes(x = beer, y = performance, group = gender, color = gender)) +
  geom_point(size = 2) +
  geom_line(aes(group = gender), linetype = "dashed", size = 1.5) +
  # Set breaks and limits in one scale call instead of a separate ylim()
  scale_y_continuous(breaks = seq(0, 10, 1), limits = c(4, 10)) +
  scale_color_manual(values = c("0" = "#93B8D6", "1" = "#F7AB64"), name = "Gender") +
  labs(x = "", y = "Grade") +
  ggthemes::theme_hc(base_size = 18) + 
  theme(legend.position = "bottom")

Becksperiment

Do unobservable factors play a role?

Variable       Beer   No beer   Difference   t-stat.   p-value
------------  -----  --------  -----------  --------  --------
Performance    8.44      7.00         1.43    -10.09      0.00
Talent         5.79      3.28         2.51    -24.76      0.00
Education      3.73      2.92         0.82     -6.54      0.00
Gender         0.40      0.40         0.00     -0.11      0.91

library(ggcorrplot)

# I make a copy of the dataset for nicer names in the correlation plot
becks_data_corr <- becks_data 
names(becks_data_corr) <- names(becks_data) %>% str_to_sentence()

# Calculate correlations 
corr <- 
  becks_data_corr %>% 
  select(-Student, -Treatment_random) %>% 
  cor() %>% 
  round(2)

ggcorrplot(corr, hc.order = TRUE,type = "upper", lab = TRUE) + 
  xlab("") + ylab("") +
  ggthemes::theme_hc(base_size = 16) + 
  theme(legend.position="bottom") 

rm(becks_data_corr)

How can we get rid of the selection bias?

Becksperiment

Randomly assigning beer drinking

Variable       Beer   No beer   Difference   t-stat.   p-value
------------  -----  --------  -----------  --------  --------
Performance    8.44      7.00         1.43      0.12      0.90
Education      3.73      2.92         0.82     -0.23      0.82
Gender         0.40      0.40         0.00      0.40      0.69
Talent         5.79      3.28         2.51      0.63      0.53

becks_data %>% 
  group_by(treatment_random) %>% 
  summarise(across("performance", ~ mean(.x))) %>% 
  mutate(beer = case_when(treatment_random == 1 ~ "Beer", 
                          treatment_random == 0 ~ "No beer"),
         # Cosmetic change to reorder so "Beer" is on the left
         beer = factor(beer, levels = c("Beer", "No beer"))) %>%
  ggplot(aes(x = beer, y = performance, group = beer, fill = beer)) +
  geom_col() +
  # Set breaks and limits in one scale call instead of a separate ylim()
  scale_y_continuous(breaks = seq(0, 10, 1), limits = c(0, 10)) +
  scale_fill_manual(name = "", values = c("#F7AB64", "#93B8D6")) +
  labs(x = "", y = "Grade") +
  ggthemes::theme_hc(base_size = 18) + 
  theme(legend.position = "bottom")

Becksperiment

Takeaways

  • Random assignment ensures that the effect is not due to selection bias
  • The difference in talent between the groups is statistically insignificant
    • The selection bias disappears

The gold standard: Experiments

Why is randomization so important?

  • Controlling variation in the causal variable, e.g. beer drinking

  • Makes sure that the treatment and control group are similar along observable and unobservable dimensions

  • The only difference between the two groups is the treatment

  • This allows us to attribute any difference in outcomes to the treatment

  • No selection bias (endogeneity issue)

The gold standard: Experiments

Designing and analyzing experiments

Types of experiments:

  • Field experiments
    • aim to be as similar as possible to real-world decision situations
  • A/B testing
    • aim to evaluate different versions of the same product
  • Lab experiments
    • are carried out in an artificial environment, usually a computer lab

The gold standard: Experiments

Setup of an experiment

  • Random assignment of treatment via assignment rule
  • Number of subjects? Ideally large, exact number via power analysis
    • How many subjects do I need to detect a certain effect size?
  • Proportion of treated subjects? Ideally 50/50
  • Covariate balance
    • Did random assignment work?

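A quick way to run such a power analysis is base R’s power.t.test() (a sketch; the effect size and power targets below are assumptions, not values from the slides):

```r
# How many subjects per group are needed to detect a mean difference of
# 0.3 standard deviations with 80% power at a 5% significance level?
res <- power.t.test(delta = 0.3, sd = 1, sig.level = 0.05, power = 0.80,
                    type = "two.sample", alternative = "two.sided")
ceiling(res$n)   # required number of subjects per group
```
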
The gold standard: Experiments

External and internal validity

  • Internal validity
    • Extent to which experiment identifies causal effect of treatment
  • External validity
    • Extent to which results generalize to other situations

Experiments in practice

  • Design can be complicated
    • Intention to treat versus actual treatment
  • Experiments need careful planning
    • Results uninformative or misleading if poorly designed
  • Analysis is relatively simple to do and to communicate

The gold standard: Experiments

Takeaways

  • Random assignment ensures that the effect is not due to selection bias
  • In the experiment you control
    • who receives the treatment (the treatment group),…
    • and who doesn’t (the control group)
  • You can control other aspects of the situation to avoid other confounders
  • You can measure the effect of the treatment on the outcome
  • Potential trade-off between internal and external validity

What to do if controlling random assignment is not possible?

Difference-in-differences

Difference-in-differences

What does it look like?

Difference-in-differences

What is the idea behind DiD?

               (1) After                  (2) Before                  (1) - (2)
-------------  -------------------------  --------------------------  ---------------------
(a) Treatment  \(Y_{treated,\ after}\)    \(Y_{treated,\ before}\)    \(\Delta_{treated}\)
(b) Control    \(Y_{control,\ after}\)    \(Y_{control,\ before}\)    \(\Delta_{control}\)
(a) - (b)      \(\Delta_{after}\)         \(\Delta_{before}\)         DiD

Difference-in-differences

A typical DiD regression looks like this

\[Y = \beta_0 + \beta_1 Treated + \beta_2 After + \beta_3 Treated \times After + \epsilon\]

  • The difference-in-differences regression gives you the same estimate as if you took differences in the group averages

  • It also takes care of any unobserved constant differences between subjects and of common time trends!

Difference-in-differences

What do the coefficients tell us?

\[Y = \beta_0 + \beta_1 Treated + \beta_2 After + \beta_3 Treated \times After + \epsilon\]

               (1) After                                (2) Before             (1) - (2)
-------------  ---------------------------------------  ---------------------  --------------------
(a) Treatment  \(\beta_0 + \beta_1 + \beta_2 + \beta_3\)  \(\beta_0 + \beta_1\)  \(\beta_2+\beta_3\)
(b) Control    \(\beta_0 + \beta_2\)                    \(\beta_0\)            \(\beta_2\)
(a) - (b)      \(\beta_1+\beta_3\)                      \(\beta_1\)            \(\beta_3\)

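The equivalence can be checked numerically. The sketch below simulates a balanced two-period panel (all numbers are made up) and confirms that the OLS interaction coefficient equals the difference-in-differences of the four group means:

```r
# Simulated balanced panel: treated/control, before/after
set.seed(123)
d <- expand.grid(treated = 0:1, after = 0:1, id = 1:200)
d$y <- 2 + 1.0 * d$treated + 0.5 * d$after +
       1.5 * d$treated * d$after + rnorm(nrow(d))

# beta_3 from the regression ...
fit <- lm(y ~ treated * after, data = d)
beta3 <- unname(coef(fit)["treated:after"])

# ... equals the difference-in-differences of the cell means
m <- with(d, tapply(y, list(treated, after), mean))
did <- (m["1", "1"] - m["1", "0"]) - (m["0", "1"] - m["0", "0"])
all.equal(beta3, unname(did))
```
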
Difference-in-differences (DiD)

DiD in action

“High” Achievers?

Cannabis Access and Academic Performance

by Marie and Zölitz (2017)

Difference-in-differences (DiD)

DiD in action

“In order to estimate the effect of legal cannabis access on student performance, we exploit a unique natural experiment that temporarily discriminated legal access to cannabis based on nationality. We apply a difference-in-differences approach across time and nationality groups.”

  • “drug tourists” from Belgium, Germany, Luxembourg, and France
  • substantial part of the city’s population are students
  • … about 16,000 individuals studying at Maastricht University, > 50% of whom are non-Dutch nationals
  • Academic performance of students who are no longer legally permitted to buy cannabis increases

  • Grade improvements are driven by younger students

  • Effects are stronger for women and low performers

  • Performance gains are larger for courses that require more numerical/mathematical skills

  • Performance gains are driven by improved understanding rather than by changes in students’ study effort

Difference-in-differences

Effect of policy change on student grades

# Eye-test 

did_data %>% 
  group_by(treatment, year) %>%
  summarise(performance = mean(performance, na.rm = TRUE)) %>% 
  mutate(type = case_when(treatment == 1 ~ "Treated group",
                          TRUE ~ "Control group")) %>%
  ggplot(aes(x = year, y = performance, color = type)) +
  geom_point(size = 2) + 
  geom_line(size = 1) +
  scale_color_manual(values = c("#93B8D6", "#F7AB64"), name = "") +
  # Dashed line marks the timing of the policy change
  geom_vline(xintercept = 0, linetype = "dashed", color = "grey") +
  scale_y_continuous(breaks = seq(4, 8, .5), limits = c(5, 8)) +
  ggthemes::theme_hc(base_size = 18) + 
  labs(x = "Time", y = "Grade") +
  theme(legend.position = "bottom")

Difference-in-differences

Let’s have a look at simple averages

Table 1: Effect of policy change on student grades

                 After   Before   Difference
--------------  ------  -------  -----------
Treated group     7.11     6.02         1.09
Control group     7.49     7.51        -0.02
Difference       -0.37    -1.49         1.11

# Descriptive statistics (kable/kable_styling come from knitr and kableExtra)
table_did <- 
  did_data %>% 
  group_by(treatment, after) %>%
  summarise(performance = mean(performance, na.rm = TRUE)) %>% 
  spread(after, performance) %>%
  mutate(Difference = `1` - `0`) %>% 
  rename(" " = treatment,
         "After" = `1`,
         "Before" = `0`) %>% 
  select(" ", "After", "Before", Difference) %>%
  mutate(" " = case_when(` ` == 1 ~ "Treated group",
                         TRUE ~ "Control group")) %>% 
  arrange(Before)

table_did <- table_did %>% bind_rows(
  # Add the difference row (treated minus control)
  tibble(
    " " = "Difference",
    After = table_did$After[1] - table_did$After[2],
    Before = table_did$Before[1] - table_did$Before[2],
    Difference = After - Before))

table_did %>% 
  kable(digits = 2) %>% 
  kable_styling(., "striped", position = "left", font_size = 35)

Difference-in-differences

Let’s check the OLS results

\(\text{Treatment} \times \text{After}\)

  • shows the difference-in-differences
  • the effect is statistically significant
  • fixed effects regression confirms the result

Crucial assumption: Parallel trends between treatment and control in the pre-period!

# Set table style 

# Dictionary => set only once per session
setFixest_dict(c(performance = "Grade",
                 treatment   = "Treatment",
                 after       = "After",
                 student     = "Student ID",
                 year        = "Year"))

# The style of the table
my_style = style.tex(tpt = TRUE, 
                     notes.tpt.intro = "\\footnotesize")
setFixest_etable(style.tex = my_style, markdown = TRUE)

# Run DiD regressions: plain OLS and a fixed-effects version
a <- feols(performance ~ treatment * after, data = did_data, vcov = "hetero") 

b <- feols(performance ~ treatment:after | student + year, data = did_data)

# Output/Export table 
etable(a, b,
       title     = "Effect of policy change on student grades",
       digits    = 3,
       tex       = TRUE,
       fitstat   = ~ar2 + n,
       replace   = TRUE,
       style.tex = style.tex("aer"),
       highlight = .("rowcol, #F7AB64, se" = "treatment:after"),
       coef.just = "l",
       placement = "h!",
       order     = c("!Constant", "^treatment:after$", "treatment", "after"),
       headers   = list("OLS" = 1, "Fixed effects" = 1),
       view      = TRUE)

Difference-in-differences

Averages versus DiD OLS regression

Effect of policy change on student grades
                 After   Before   Difference
--------------  ------  -------  -----------
Treated group     7.11     6.02         1.09
Control group     7.49     7.51        -0.02
Difference       -0.37    -1.49         1.11

Difference-in-differences

Treatment effect in “event-time”

feols(performance ~ i(year, treatment, -1) | student + year , data = did_data) %>% 
  etable(.,
          #title    = "Effect of policy change on student grades",
          digits   = 3,
          tex      = TRUE,
          fitstat  = ~ar2 + n,
          replace = T,
          style.tex = style.tex("aer"),
          highlight = .("rowcol, #C9DBEA, se" = c("year::-2:treatment"),
                        "rowcol, #E7E8EE, se" = c("year::1:treatment","year::2:treatment")),
          coef.just = "l",
          placement = "h!",
          view = T
         )
# ggiplot() comes from the ggfixest package
feols(performance ~ i(year, treatment, -1) | student + year, data = did_data) %>% 
ggiplot(
    ref.line = -1,
    main = "",
    xlab = "Time to treatment",
    multi_style = "facet",
    geom_style = "ribbon",
    col = '#F7AB64',
    #facet_args = list(labeller = labeller(id = \(x) gsub(".*: ", "", x))),
    theme = ggthemes::theme_hc(base_size = 18) +
        theme(
            text = element_text(),
            plot.title = element_text(hjust = 0.5),
            legend.position = "none"
        )
)

Takeaways

Takeaways

Randomized trials

  • Simple comparisons of averages can be misleading
    • Selection bias
  • (Controlled) random assignment ensures that the treatment and control group are similar
  • Controlled experiments sometimes face a trade-off between internal and external validity

Takeaways

Difference-in-differences (DiD)

  • Idea: Compare the average change in the outcome for the treatment group with the average change for the control group, before and after the intervention
  • Important: Parallel trends assumption
    • Cannot be tested directly
    • Pre-intervention trends can give indirect evidence

Good luck with your BSc projects!

See you soon …

in our Accounting MSc!

Appendix

References & useful resources

References

Papers and books

Angrist, J.D., Pischke, J.-S., 2015. Mastering ’metrics: The path from cause to effect. Princeton University Press, Princeton, NJ.
Békés, G., Kézdi, G., 2021. Data analysis for business, economics, and policy. Cambridge University Press.
Dunning, T., 2012. Natural experiments in the social sciences: A design-based approach. Cambridge University Press.
Libby, R., Bloomfield, R., Nelson, M.W., 2002. Experimental research in financial accounting. Accounting, Organizations and Society 27, 775–810.
Marie, O., Zölitz, U., 2017. “High” achievers? Cannabis access and academic performance. The Review of Economic Studies 84, 1210–1237.

Useful resources

Writing

How to review a paper (general advice)

How to review an accounting paper

ERIM journal list

Writing Tips For Economics Research Papers

Useful resources

Textbooks about causal inference

Causal Inference: The Mixtape

The effect: An introduction to research design and causality

Research Design in the Social Sciences: Declaration, Diagnosis, and Redesign

Engineering journal: Causal inference

Useful resources

General econometrics books

Introduction to Econometrics with R

Useful resources

Resource for code in empirical accounting research (work-in-progress)

Accounting Research: An Introductory Course

Data Science for Economists and Other Animals

Tutorial for Causal Panel Analysis

Tidy Finance with R

Useful resources

How to get tables from R to Word

modelsummary package homepage

Creating publication-ready Word tables in R

Useful resources

Data viz in R

A Gentle Guide to the Grammar of Graphics with ggplot2 (slideshow)

R Graphics Cookbook, 2nd edition

Data Visualization with R

Check all relevant assumptions for a regression model in one go